Audio Segmentation using Line Spectral Pairs

Authors

  • N. P. Jawarkar
  • R. S. Holambe
  • T. K. Basu

Abstract

This paper describes a technique for unsupervised audio segmentation. The main objective of the work is to study the performance of an audio segmentation system that uses a metric-based method. The system first classifies the audio signal into speech and nonspeech using the variance of the zero-crossing rate. Line spectral pair (LSP) features are then used to detect speaker change points automatically. The Hotelling T² distance metric is used in the first stage for coarse speaker change detection. In the second stage, the Bayesian information criterion (BIC) validates the potential speaker change points found by the coarse segmentation, which reduces the false alarm rate. A database of four files is used for segmentation; it contains speech recorded from different combinations of male and female speakers, mixed with nonspeech signals such as music and environmental sound. The file with one male and one female speaker gives the best performance, with an F1 measure of 0.9474.
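
The three quantities named in the abstract can be sketched as follows. This is a minimal illustration using the textbook definitions, not the authors' implementation: the frame length, the pooled-covariance form of Hotelling's T², the full-covariance Gaussian form of ΔBIC, and the `penalty_weight` parameter are all assumptions, and the function names are hypothetical. Feature windows are taken to be arrays with one row per frame (for example, one LSP vector per frame).

```python
import numpy as np

def zcr_variance(signal, frame_len=400):
    """Variance of the per-frame zero-crossing rate (frame_len is an assumed value)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    signs = np.signbit(frames)
    # Fraction of adjacent sample pairs whose sign differs, per frame.
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return float(np.var(zcr))

def hotelling_t2(x, y):
    """Two-sample Hotelling T^2 between feature windows x and y (pooled covariance)."""
    n, m = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    pooled = ((n - 1) * np.cov(x, rowvar=False)
              + (m - 1) * np.cov(y, rowvar=False)) / (n + m - 2)
    return float(n * m / (n + m) * diff @ np.linalg.solve(pooled, diff))

def delta_bic(x, y, penalty_weight=1.0):
    """Delta BIC for a candidate change point between x and y.

    Positive values favour two separate Gaussian models, i.e. a change point.
    """
    z = np.vstack([x, y])
    n, d = z.shape

    def logdet(a):
        # Log-determinant of the maximum-likelihood covariance estimate.
        return np.linalg.slogdet(np.cov(a, rowvar=False, bias=True))[1]

    gain = 0.5 * (n * logdet(z) - len(x) * logdet(x) - len(y) * logdet(y))
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return float(gain - penalty_weight * penalty)
```

In a two-stage scheme of the kind described, a peak in `hotelling_t2` between adjacent windows would mark a candidate change point, and the candidate would be kept only if `delta_bic` for the same two windows is positive. The thresholds and window sizes are left open here because the abstract does not state them.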

Similar resources

On the duality between line-spectral frequencies and zero-crossings of signals

Line spectrum pairs (LSPs) are the roots (located in the complex-frequency or z-plane) of symmetric and antisymmetric polynomials synthesized using a linear prediction (LPC) polynomial. The angles of these roots, known as line-spectral frequencies (LSFs), implicitly represent the LPC polynomial and hence the spectral envelope of the underlying signal. By exploiting the duality between the time a...
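
The construction described in this snippet can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not code from the cited paper: the function name `lpc_to_lsf` is hypothetical, the input is assumed to be the coefficient vector `[1, a1, ..., ap]` of a minimum-phase LPC polynomial A(z) in powers of z⁻¹, and general-purpose root finding is used for clarity rather than the faster methods found in speech codecs.

```python
import numpy as np

def lpc_to_lsf(a, eps=1e-6):
    """Line-spectral frequencies (radians, ascending) of an LPC polynomial.

    a: coefficients [1, a1, ..., ap] of A(z) in powers of z^-1.
    """
    a = np.asarray(a, dtype=float)
    ext = np.concatenate([a, [0.0]])
    p_poly = ext + ext[::-1]  # symmetric polynomial A(z) + z^-(p+1) A(1/z)
    q_poly = ext - ext[::-1]  # antisymmetric polynomial A(z) - z^-(p+1) A(1/z)
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one angle per conjugate pair and drop the trivial roots at z = 1 and z = -1.
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])
```

For an order-p polynomial this returns p angles in (0, π), with the roots of the two polynomials alternating along the unit circle when A(z) is minimum phase.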

Interpolation of Long Gaps in Audio Signals Using Line Spectrum Pair Polynomials

This technical report addresses model-based interpolation of long signal gaps. It demonstrates that employing a modified autoregressive AR model, computed as a weighted sum of line spectral pair (LSP) polynomials, is more efficient computationally than using a conventional AR model, since longer signal gaps can be interpolated at reduced model order. Key-words: acoustic signal processing, audio...

The Impact of the Spectral Filter Bandwidth on the Spectral Entanglement and Indistinguishability of Photon Pairs of SPDC Process

In this paper, we have investigated the dependence of the spectral entanglement and indistinguishability of photon pairs produced by the spontaneous parametric down-conversion (SPDC) procedure on the bandwidth of spectral filters used in the detection setup. The SPDC is a three-wave mixing process which occurs in a nonlinear crystal and generates entangled photon pairs and utilizes as one of th...

Crosscorrelation-based multispeaker speech activity detection

We propose an algorithm for segmenting multispeaker meeting audio, recorded with personal channel microphones, into speech and non-speech intervals for each microphone’s wearer. An algorithm of this type turns out to be necessary prior to subsequent audio processing because, in spite of close-talking microphones, the channels exhibit a high degree of crosstalk due to unbalanced calibration and ...

Content analysis for audio classification and segmentation

In this paper, we present our study of audio content analysis for classification and segmentation, in which an audio stream is segmented according to audio type or speaker identity. We propose a robust approach that is capable of classifying and segmenting an audio stream into speech, music, environment sound, and silence. Audio classification is processed in two steps, which makes it suitable ...

Publication date: 2012